Supplementary Material
A Additional Results
A.1 Molecule Design
We present more examples of molecules generated by our method and by the CNN baseline liGAN. For each method and each binding site, we select the 6 molecules with the highest binding affinity. The 3 additional binding sites are selected randomly from the test set. Comparing samples from the two methods, we find that the 3D molecules generated by our method are generally more realistic, whereas those generated by the baseline contain more erroneous structures, such as bonds that are too short and angles that are too sharp. Moreover, molecules generated by our method are more diverse, whereas the 3D atom configurations generated by the baseline are often similar to one another.
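The selection step described above (keeping the 6 best-scoring molecules per method and binding site) can be sketched as follows. This is a minimal illustration and not the authors' code: the record layout `(method, site, mol_id, score)`, the function name `top_k_per_group`, and the `higher_is_better` flag are all assumptions. If affinity is estimated with a docking score in kcal/mol, lower values usually indicate stronger binding, in which case `higher_is_better=False` would be appropriate.

```python
from collections import defaultdict


def top_k_per_group(records, k=6, higher_is_better=True):
    """Return the ids of the k best-scoring molecules per (method, site).

    `records` is an iterable of (method, site, mol_id, score) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    groups = defaultdict(list)
    for method, site, mol_id, score in records:
        groups[(method, site)].append((score, mol_id))

    selected = {}
    for key, scored in groups.items():
        # Sort by score only; ties keep their original order (stable sort).
        ranked = sorted(scored, key=lambda t: t[0], reverse=higher_is_better)
        selected[key] = [mol_id for _, mol_id in ranked[:k]]
    return selected
```

Grouping before ranking keeps the comparison fair: each method contributes the same number of samples for every binding site.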
Full-Atom Peptide Design with Geometric Latent Diffusion
Peptide design plays a pivotal role in therapeutics, opening new possibilities for targeting binding sites that were previously considered undruggable. Most existing methods are either inefficient or concerned only with the target-agnostic design of 1D sequences. In this paper, we propose a generative model for full-atom Peptide design with Geometric LAtent Diffusion (PepGLAD) given the binding site. We first establish a benchmark consisting of both 1D sequences and 3D structures from the Protein Data Bank (PDB) and the literature for systematic evaluation. We then identify two major challenges in applying current diffusion-based models to peptide design: the full-atom geometry and the variable binding geometry. To tackle the first challenge, PepGLAD derives a variational autoencoder that first encodes full-atom residues of variable size into fixed-dimensional latent representations, and then decodes back to the residue space after conducting the diffusion process in the latent space. For the second challenge, PepGLAD explores a receptor-specific affine transformation to convert the 3D coordinates into a shared standard space, enabling better generalization across different binding shapes. Experimental results show that our method not only significantly improves diversity and binding affinity in the task of sequence-structure co-design, but also excels at recovering reference structures for binding conformation generation.
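To make the idea of a receptor-specific affine transformation concrete, the sketch below maps coordinates into a standardized space using the mean and covariance of the binding-site atoms. The abstract does not say how the transformation is constructed, so the choice of a Cholesky factor of the covariance, the function names (`fit_affine`, `to_standard`, `from_standard`), and the regularizer `eps` are all assumptions made for illustration.

```python
import numpy as np


def fit_affine(site_coords, eps=1e-8):
    """Estimate an affine map from binding-site coordinates of shape (N, 3).

    Assumption: the map is z = L^{-1} (x - mu), where mu is the mean and
    L is the Cholesky factor of the (slightly regularized) covariance.
    """
    mu = site_coords.mean(axis=0)
    cov = np.cov(site_coords, rowvar=False)        # 3x3 covariance
    L = np.linalg.cholesky(cov + eps * np.eye(3))  # lower-triangular factor
    return mu, L


def to_standard(x, mu, L):
    """Map coordinates (M, 3) into the standardized space."""
    # Solve L z^T = (x - mu)^T instead of forming L^{-1} explicitly.
    return np.linalg.solve(L, (x - mu).T).T


def from_standard(z, mu, L):
    """Invert the map: x = L z + mu, applied row-wise."""
    return z @ L.T + mu
```

Under these assumptions, binding-site coordinates have zero mean and approximately identity covariance after the transform, whatever the original size or elongation of the pocket. Peptide coordinates would be mapped with the same `mu` and `L`, processed in the standardized space, and mapped back with `from_standard`.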